Compiler-Assisted Checkpointing

Authors

  • Micah Beck
  • James S. Plank
  • Gerry Kingsley
Abstract

In this paper we present compiler-assisted checkpointing, a new technique that uses static program analysis to optimize the performance of checkpointing. We achieve this performance gain using libckpt, a checkpointing library that implements memory exclusion in the context of user-directed checkpointing. The correctness of user-directed checkpointing depends on program analysis and on the insertion of memory exclusion calls by the programmer. With compiler-assisted checkpointing, this analysis is automated by a compiler or preprocessor. The resulting memory exclusion calls optimize the performance of checkpointing and are guaranteed to be correct. We provide a full description of our program analysis techniques and present detailed examples of analyzing three Fortran programs. The results of these analyses have been implemented in libckpt, and we present the performance improvements that they yield.
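
The idea of memory exclusion can be illustrated with a small toy model. The sketch below is an assumption-laden illustration, not libckpt's interface: the class `ToyCheckpointer`, its methods `exclude`, `include`, `checkpoint`, and `restore`, and the region names `grid`, `scratch`, and `coeffs` are all hypothetical. The exclusion decisions are written by hand here, which corresponds to the user-directed case; the abstract describes having a compiler or preprocessor derive them instead.

```python
# Toy model of memory exclusion. Hypothetical names throughout;
# this does not reproduce libckpt's actual API.

class ToyCheckpointer:
    def __init__(self):
        self.excluded = set()  # regions that later checkpoints may skip
        self.saved = {}        # most recently saved copy of each region

    def exclude(self, name):
        """Mark a region as safe to omit from subsequent checkpoints."""
        self.excluded.add(name)

    def include(self, name):
        """Mark a region as required again in subsequent checkpoints."""
        self.excluded.discard(name)

    def checkpoint(self, memory):
        """Save every non-excluded region and return the names written."""
        written = [n for n in memory if n not in self.excluded]
        for n in written:
            self.saved[n] = list(memory[n])
        return written

    def restore(self):
        """Return a copy of the last saved state of every region."""
        return {n: list(v) for n, v in self.saved.items()}


def run():
    memory = {
        "grid": [1.0, 2.0, 3.0],
        "scratch": [0.0, 0.0, 0.0],
        "coeffs": [0.5, 0.25],
    }
    ckpt = ToyCheckpointer()
    first = ckpt.checkpoint(memory)  # initial checkpoint saves everything

    # coeffs is never written again, so the copy already saved stays valid.
    ckpt.exclude("coeffs")
    # scratch is fully overwritten before it is read in every iteration,
    # so its value at a checkpoint is dead and need not be saved.
    ckpt.exclude("scratch")

    for _ in range(2):
        memory["scratch"] = [x * memory["coeffs"][0] for x in memory["grid"]]
        memory["grid"] = [g + s for g, s in zip(memory["grid"], memory["scratch"])]

    second = ckpt.checkpoint(memory)  # only the live, modified region is written
    return first, second, ckpt.restore(), memory
```

In this sketch the second checkpoint writes only `grid`. The restored `scratch` is stale, which is harmless only because the program overwrites it before reading it; establishing that property is exactly the kind of analysis whose correctness the abstract says must otherwise rest on the programmer.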

Similar resources

Compiler-assisted Full Checkpointing

This paper describes a compiler-based approach to checkpointing for process recovery. The implementation is transparent to both the programmer and the hardware. The compiler-generated sparse potential checkpoint code maintains the desired checkpoint interval. Adaptive checkpointing reduces the size of the checkpoints. Training is used to select low-cost, high-coverage potential checkpoints. The...

CPPC: a compiler-assisted tool for portable checkpointing of message-passing applications

With the evolution of high-performance computing towards heterogeneous, massively parallel systems, parallel applications have developed new checkpoint and restart necessities. Whether due to a failure in the execution or to a migration of the application processes to different machines, checkpointing tools must be able to operate in heterogeneous environments. However, some of the data manipul...

Compiler Support for Fine-Grain Software-Only Checkpointing

Checkpointing support allows program execution to roll-back to an earlier program point, discarding any modifications made since that point. Existing software-based checkpointing methods are mainly libraries that snapshot all of working-memory, and hence have prohibitive overhead for many potential applications. In this paper we present a light-weight, fine-grain checkpointing framework impleme...

Compiler-Enhanced Incremental Checkpointing

As modern supercomputing systems reach the peta-flop performance range, they grow in both size and complexity. This makes them increasingly vulnerable to failures from a variety of causes. Checkpointing is a popular technique for tolerating such failures in that it allows applications to periodically save their state and restart the computation after a failure. Although a variety of automated s...

Compiler Supported Interval Optimisation for Communication Induced Checkpointing

There exist mainly three different approaches of checkpoint-based recovery mechanisms for distributed systems: coordinated checkpointing, uncoordinated checkpointing and communication induced checkpointing. It can be shown that communication induced checkpointing theoretically has the least minimum overhead, but also that the effective overhead depends on the communication behaviour and the res...

Publication date: 1994